| Parameter | Value | Variable |
|---|---|---|
| Series Code | 21864 | ts |
| Maximum Lag | 48 | af.lags |
| Forecast Horizon | 12 | n.ahead |
| Information Criterion | BIC | inf.crit |
| Unit Root Test | ADF, drift, 6, BIC, 5pct | ur.test |
| ARCH Test | 12, FALSE, 0.05 | arch.test |
| Box Test | 1, Ljung-Box, 0 | box.test |
| Dummy | FALSE | dummy |
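The parameters in the table above can be collected into R objects before the analysis starts. A minimal sketch; the element names inside the test lists (`type`, `selectlags`, `demean`, and so on) are assumptions that must match the signatures of the functions they are later passed to via `do.call`:

```r
# Analysis parameters, following the table above
ts      <- 21864          # BETS series code
af.lags <- 48             # maximum lag shown in the correlograms
n.ahead <- 12             # forecast horizon

# Extra arguments forwarded to the test functions via do.call()
ur.test   <- list(mode = "ADF", type = "drift", lags = 6,
                  selectlags = "BIC", level = "5pct")
arch.test <- list(lags = 12, demean = FALSE, alpha = 0.05)
box.test  <- list(lag = 1, type = "Ljung-Box", fitdf = 0)

dummy    <- FALSE         # no dummy for structural breaks
inf.crit <- "BIC"         # information criterion used later by auto.arima
```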
library(BETS)
data <- BETS.get(ts)
info <- BETS.search(code = ts, view = F)
| Code | Description | Unit | Start | Last | Source |
|---|---|---|---|---|---|
| 21864 | Physical Production - Intermediate goods | Index | 01/01/2002 | mar/2017 | IBGE |
library(mFilter)
trend = fitted(hpfilter(data))
library(dygraphs)
dygraph(cbind(Series = data, Trend = trend), main = info[,"Description"]) %>%
dyRangeSelector(strokeColor = "gray", fillColor = "gray") %>%
dyAxis("y", label = info[,"Unit"])
test.params = append(list(y = data), ur.test)
df = do.call(BETS.ur_test,test.params)
df$results
## statistic crit.val rej.H0
## tau2 -4.118928 -2.88 yes
## phi1 8.488818 4.63 no
At the 95% confidence level, the test statistic tau2 is smaller than the critical value, so we reject the null hypothesis of a unit root. We therefore conclude that there is no non-seasonal unit root.
ns_roots = 0
d_ts = data
This test is performed with lag 12, that is, the frequency of series 21864.
library(forecast)
s_roots = nsdiffs(data)
print(s_roots)
## [1] 0
According to the OCSB test, there is no seasonal unit root, at least at a 5% significance level.
BETS.corrgram(d_ts, lag.max = af.lags, mode = "bartlett", knit = T)
BETS.corrgram(d_ts, lag.max = af.lags, mode = "simple", type = "partial", knit = T)
The correlograms from the last section give us enough information to try to identify the underlying SARIMA model parameters. We can check our guess by running the auto.arima function from the forecast package. By default, this function uses the AICc (Akaike Information Criterion with finite-sample correction) for model selection. Here, we are going to use the BIC (Bayesian Information Criterion), whose penalty term for the number of parameters in the model is larger than that of the AIC.
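The two criteria differ only in the penalty term: for k estimated parameters and n observations, AIC = -2 log L + 2k, while BIC = -2 log L + k log(n), so the BIC penalizes additional parameters more heavily whenever n > e^2 (about 7.4). A quick base-R illustration on a simulated AR(1) series (the series and candidate orders are ours, not from the text):

```r
set.seed(1)
x <- arima.sim(model = list(ar = 0.7), n = 120)

# A parsimonious fit versus an over-parameterized one
m1 <- arima(x, order = c(1, 0, 0))
m2 <- arima(x, order = c(3, 0, 1))

# BIC - AIC = k * (log(n) - 2): the gap grows with the parameter count,
# so the BIC is harder on the larger model
c(AIC = AIC(m1), BIC = BIC(m1))
c(AIC = AIC(m2), BIC = BIC(m2))
```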
model <- auto.arima(data, ic = tolower(inf.crit), test = tolower(ur.test$mode),
max.d = ns_roots, max.D = s_roots)
summary(model)
## Series: data
## ARIMA(3,0,1)(1,0,0)[12] with non-zero mean
##
## Coefficients:
## ar1 ar2 ar3 ma1 sar1 mean
## 1.3585 -0.1123 -0.3139 -0.6598 0.8484 89.5756
## s.e. 0.1543 0.1741 0.0710 0.1557 0.0368 5.1006
##
## sigma^2 estimated as 7.783: log likelihood=-469.63
## AIC=953.25 AICc=953.87 BIC=975.98
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1335656 2.745405 2.078356 0.06256712 2.227039 0.4780917
## ACF1
## Training set -0.02283119
We see that, according to the BIC, the best model is a SARIMA(3,0,1)(1,0,0)[12]. Nevertheless, this is not the end. We still have to test for heteroskedasticity in the residuals. We can use an ARCH test for this purpose.
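BETS.arch_test is a Lagrange multiplier (LM) test in the spirit of Engle (1982): regress the squared residuals on their own lags and compare n times R-squared with a chi-squared distribution. A base-R sketch of the idea (the function name `arch_lm` is ours, not the BETS implementation):

```r
# Engle's ARCH LM test; H0: no ARCH effects (homoskedastic residuals)
arch_lm <- function(e, lags = 12) {
  e2  <- e^2
  X   <- embed(e2, lags + 1)             # col 1 = e2[t], cols 2.. = its lags
  fit <- lm(X[, 1] ~ X[, -1])
  n   <- nrow(X)
  stat <- n * summary(fit)$r.squared     # LM statistic, ~ chi-squared(lags)
  c(statistic = stat,
    p.value = pchisq(stat, df = lags, lower.tail = FALSE))
}

set.seed(2)
res <- arch_lm(rnorm(200))               # white noise has no ARCH effects
res
```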
arch.params <- append(list(x = resid(model)), arch.test)
at <- do.call(BETS.arch_test, arch.params)
at
## statistic p.value htk
## 1 20.00048 0.06707697 TRUE
Although the p-value of 0.07 sits slightly above the 0.05 significance level, the test flags the residuals as heteroskedastic (htk = TRUE). This means we should have built the SARIMA model for the log of the original series.
The next step, then, is to rebuild the model using log(data).
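The reason the log helps: when the variation of a series is proportional to its level (multiplicative noise), taking logs turns it into roughly constant additive variation, which is what a SARIMA model with homoskedastic errors assumes. A small simulated illustration (the numbers are ours):

```r
set.seed(3)
level <- exp(seq(0, 2, length.out = 300))      # rising level
x     <- level * exp(rnorm(300, sd = 0.05))    # multiplicative noise

# On the raw scale the spread grows with the level...
c(early = sd(x[1:100]), late = sd(x[201:300]))

# ...while on the log scale it is roughly constant
c(early = sd(log(x)[1:100]), late = sd(log(x)[201:300]))
```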
data <- log(data)
ns_roots = 0
d_ts = data
test.params = append(list(y = d_ts), ur.test)
df = do.call(BETS.ur_test,test.params)
while(df$results[1, "statistic"] > df$results[1, "crit.val"]){
  ns_roots = ns_roots + 1
  d_ts = diff(d_ts)
  test.params = append(list(y = d_ts), ur.test)
  df = do.call(BETS.ur_test, test.params)
}
ns_roots
## [1] 0
s_roots = nsdiffs(data)
print(s_roots)
## [1] 0
if(s_roots != 0) {
  d_ts <- diff(d_ts, lag = frequency(data), differences = s_roots)
}
s_roots
## [1] 0
BETS.corrgram(d_ts, lag.max = af.lags, mode = "bartlett", knit = T)
BETS.corrgram(d_ts, lag.max = af.lags, mode = "simple", type = "partial", knit = T)
model = auto.arima(data, ic = tolower(inf.crit), test = tolower(ur.test$mode),
max.d = ns_roots, max.D = s_roots)
summary(model)
## Series: data
## ARIMA(1,0,0)(1,0,0)[12] with non-zero mean
##
## Coefficients:
## ar1 sar1 mean
## 0.8559 0.8404 4.4873
## s.e. 0.0368 0.0362 0.0759
##
## sigma^2 estimated as 0.001003: log likelihood=379.68
## AIC=-751.36 AICc=-751.15 BIC=-738.37
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set 0.001214914 0.03142528 0.02416248 0.02253483 0.5333905
## MASE ACF1
## Training set 0.519988 -0.1226789
The best model now seems to be a SARIMA(1,0,0)(1,0,0)[12]. Once again, we test the residuals for heteroskedasticity.
arch.params <- append(list(x = resid(model)), arch.test)
at <- do.call(BETS.arch_test, arch.params)
at
## statistic p.value htk
## 1 28.02658 0.005482736 FALSE
This time the htk flag is FALSE, leading us to conclude that the residuals of the second model are not heteroskedastic. This model is therefore a better choice than the first.
The next function outputs the model's standardized residuals. If they all lie inside the confidence interval, the behaviour of the series was well captured by the model. If a few residuals fall outside the interval, using a dummy to handle structural breaks should be considered. But if most residuals fall outside it, a SARIMA model is probably not the appropriate choice.
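Under the hood, standardized residuals are just the residuals divided by their estimated standard deviation, compared against normal-quantile bounds. A base-R sketch of the computation (not the BETS.std_resid implementation itself), using a simulated fit:

```r
set.seed(4)
x   <- arima.sim(model = list(ar = 0.8), n = 150)
fit <- arima(x, order = c(1, 0, 0))

alpha   <- 0.01
std_res <- resid(fit) / sqrt(fit$sigma2)   # standardize by the innovation sd
bound   <- qnorm(1 - alpha / 2)            # two-sided 99% band, about 2.58

# Indices of candidate outliers / structural breaks
which(abs(std_res) > bound)
```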
rsd <- BETS.std_resid(model, alpha = 0.01)
As a final step in model evaluation, we are going to apply the Ljung-Box test to check for autocorrelation in the residuals.
test.params <- append(list(x = resid(model)), box.test)
bt = do.call(Box.test,test.params)
bt$p.value
## [1] 0.08831017
The p-value is greater than 0.05, which means there is not enough statistical evidence to reject the null hypothesis of no autocorrelation in the residuals. This is a desirable result.
BETS.predict(model,h=n.ahead, main = info[,"Description"], ylab = info[,"Unit"], knit = T)